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Abstract. A branch and bound algorithm is developed for global optimization. Branching 
in the algorithm is accomplished by subdividing the feasible set using ellipses. Lower bounds are 
obtained by replacing the concave part of the objective function by an affine underestimate. A ball 
approximation algorithm, obtained by generalizing of a scheme of Lin and Han, is used to solve 
the convex relaxation of the original problem. The ball approximation algorithm is compared to 
SEDUMI as well as to gradient projection algorithms using randomly generated test problems with 
a quadratic objective and ellipsoidal constraints. 
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1. Introduction. In this paper we develop a branch and bound algorithm for 
the global optimization of the problem 

min /(x) subject to x £ ft, (P) 

where f2 C K" is a compact set and / : 1" — > K is a weakly convex function [25]; 
that is, /(x) + cr||x|| 2 is convex for some a > 0. The algorithm starts with a known 
ellipsoid £ containing f2. The branching process in the branch and bound algorithm 
is based on successive ellipsoidal bisections of the original £. A lower bound for the 
objective function value over an ellipse is obtained by writing / as the sum of a convex 
and a concave function and replacing the concave part by an affine underestimate. 
See [3 [13] for discussions concerning global optimization applications. 

As a specific application of our global optimization algorithm, we consider prob- 
lems with a quadratic objective function and with quadratic, ellipsoidal constraints. 
Global optimization algorithms for problems with quadratic objective function and 
quadratic constraints include those in [TJ [TS] [55] . In [55J Raber considers problems 
with nonconvex, quadratic constraints and with an n-simplex enclosing the feasible 
region. He develops a branch and bound algorithm based on a simplicial-subdivision 
of the feasible set and a linear programming relaxation over a simplex to estimate 
lower bounds. In a similar setting with box constraints, Linderoth [18] develops a 
branch and bound algorithm in which the the feasible region is subdivided using the 
Cartesian product of two-dimensional triangles and rectangles. Explicit formulae for 
the convex and concave envelops of bilinear functions over triangles and rectangles 
were derived. The algorithm of Le pQ focuses on problem with convex quadratic con- 
straints; Lagrange duality is used to obtain lower bounds for the objective function, 
while ellipsoidal bisection is used to subdivide the feasible region. 

The paper is organized as follows. In Section[5Jwe review the ellipsoidal bisection 
scheme of [I] which is used to subdivide the feasible region. Section [3] develops the 
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convex underestimator used to obtain a lower bound for the objective function. Since 
/ is weakly convex, we can write it as the sum of a convex and concave functions: 

/(x) = (/(x)+a||x|| 2 ) + (- < 7||x|| 2 ), (1.1) 

where a > 0. A decomposition of this form is often called a DC (difference convex) 
decomposition (see |13j). For example, if / is a quadratic, then we could take 

a = — min{0, Ai}, 

where Ai is the smallest eigenvalue of the Hessian V 2 /. The concave term — <r||x|| 2 in 
(|1.1[) is underestimated by an affine function £ which leads to a convex underestimate 
f L of / given by 

/ l (x) = (/(x) +( t||x|| 2 )+£(x). (1.2) 

We minimize /z, over the set £ (~1 to obtain a lower bound for the objective function 
on a subset of the feasible set. An upper bound for the optimal objective function 
value is obtained from the best feasible point produced when computing the lower 
bound, or from any local algorithm applied to this best feasible point. Note that 
weak convexity for a real- valued function is the analogue of hypomonotonicity for the 
derivative operator [Jj [HI [21] . 

In Section [4] we discuss the phase one problem of finding a point in Q which 
also lies in the ellipsoid £ . Section [5] gives the branch and bound algorithm and 
proves its convergence. Section [6] focuses on the special case where / and are 
convex. The ball approximation algorithm of Lin and Han [161 117] for projecting a 
point onto a convex set is generalized to replace the norm objective function by an 
arbitrary convex function. Numerical experiments, reported in Section [7J compare 
the ball approximation algorithm to SEDUMI 1.1 as well as to gradient projection 
algorithms. We also compare the branch and bound algorithm to a scheme of An [T] 
in which the lower bound is obtained by Lagrange duality. 

Notation. Throughout the paper, || • || denotes the Euclidian norm. Given 
x, y G K", [x, y] is the line segment connecting x and y: 

[x,y] = {(l-*)x + iy :0<t< 1}. 

The open line segment, which excludes the ends x and y, is denoted (x, y). The 
interior of a set iS is denoted int S, while ri S is the relative interior. The gradient 
V/(x) is a row vector with 



(V/(x))< 



dx,. 



The diameter of a set S is denoted 8(S): 

8{S) =sup {||x-y|| :x, y G S}. 

2. Ellipsoidal bisection. In this section, we give a brief overview of the ellip- 
soidal bisection scheme introduced by An pQ. This idea originates from the ellipsoid 
method for solving convex optimization problems by Shor, Nemirovski and Yudin 
[2"5l |2"7] . Consider an ellipsoid £ with center c in the form 

£ = {x G K" : (x - c) T B- 1 (x - c) < 1}, (2.1) 
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where B is a symmetric, positive definite matrix. Given a nonzero vector v G R™, the 
sets 

H- ={xe£ : v T x < v T c} and H+ = {x G £ : v T x > v T c} 

partition £ into two sets of equal volume. The centers c+ and c and the matrix B± 
of the ellipsoids £± of minimum volume containing H± are given as follows: 

d n 2 / 2dd T \ , Bv 



n + 1 n 2 - 1 V n + ll Vv T Bv 

As mentioned in pQ, if the normal v always points along the major axis of £ , then a 
nested sequence of bisections shrinks to a point. 

3. Bounding procedure. In this section, we obtain an affine underestimate £ 
for the concave function — ||x|| 2 on the ellipsoid 

£ = {x G W l : x T Ax - 2b T x < p}, (3.1) 

where A is a symmetric, positive definite matrix, b G R™, and pel. The set of 
affine underestimates for — ||x|| 2 is given by 

U = {£:R n ^R, £ is affine, - ||x|| 2 > £(x) for all x G £}. (3.2) 

The best underestimate is a solution of the problem 

min max - (llxll 2 + £(x)) . (3.3) 



Theorem 3.1. A solution of (jg.5^ is £*(x.) = -2c T x + 7, where c = A x b is the 
center of the ellipsoid, 7 = 2c T /x — ||/x|| 2 , and 

(i = arg max||x — c|| 2 . (3-4) 
If S(£) is the diameter of £, then 

5{£f 



mm max 



(||x|| 2 +^(x)) = 



leu xes v ' 4 



Proof. To begin, we will show that the minimization in (|3 . 3|) can be restricted to a 
compact set. Clearly, when carrying out the minimization in (|3.3p . we should restrict 
our attention to those £ which touch the function h(x) = — j|x|| 2 at some point in £ . 
Let y € £ denote the point of contact. Since /i(x) > £(x) and h(y) = i?(y), a lower 
bound for the error /i(x) — £(x) over x G £ is 

Mx)-*(x)>|*(x)-*(y)|-|M*)-My)l- 

If M is the difference between the maximum and minimum value of h over £, then 
we have 



ft(x)-*(x)>|*(x)-*(y)|-M. 



(3.5) 
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An upper bound for the minimum in (|3.3p is obtained by the function £ which is 
constant on £, with value equal to the minimum of /i(x) over x G £. If w is a point 
where h attains its minimum over £, then we have 

max/i(x) — ^o(x) = maxft(x) — h(w) = M. 

For x S £, we have 

h(x) - £(x) < max /i(x) - ^(x) < max /i(x) - £ (x) = M (3.6) 

when we restrict our attention to affine functions I which achieve an objective function 
value in (|3. 3|) which are at least as good as £q. Combining (|3.5p and (|3.6p gives 

|^(x)-%)|<2M (3.7) 

when ^ achieves an objective function value in (|3.3[) which is at least as good as 
£q. Thus, when we carry out the minimization in (|3.3[) . we should restrict to affine 
functions which touch h at some point y G £ and with the change in £ across £ 
satisfying the bound (|3.7p for all x G £. This tells us that the minimization in (|3.3D 
can be restricted to a compact set, and that a minimizer must exist. 

Suppose that £ attains the minimum in (|3.3p . Let z be a point in £ where 
h(x) — £(x) achieves its maximum. A Taylor expansion around x = z gives 

ft(x) - £(x) = /i(z) - <?(z) + (V/i(z) - W)(x - z) - ||x - z|| 2 (3.8) 

since /i(x) = — ||x|| 2 . Since £ £ U, the set given in (|3.2p . we have /i(x) — l'(x) > for 
all x G £, so ([375]) yields 

< /i(z) - £{z) + (V/i(z) - W)(x - z) - ||x - z|| 2 (3.9) 

for all x G £. By the first-order optimality conditions for z, we have 

(V/i(z) - Vf)(x-z) <0 

for all x G £. It follows from (|3.9| that 

0</i(z)-^(z)-||x-z|| 2 , 



or 

h(z) -£(z) > ||x-z|| 2 
for all x G £. Since there exists x G £ such that ||x — z|| > S(£)/2, we have 

max/i(x) - ^(x) = h(z) - £(z) > 5(£ ) 2 /4. (3.10) 

We now observe that for the specific affine function I* given in the statement of 
the theorem, (|3.10|) becomes an equality, which implies the optimality of I* in (|3.3|) . 
Expand in a Taylor series around x = c, where c = A _1 b is the center of the ellipsoid 
£, to obtain 

/i(x) = -||c|| 2 - 2c T (x - c) - ||x - c|| 2 = -2c T x + ||c|| 2 - ||x - c|| 2 . 
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Hence, for £* , we have 

ft(x) - r(x) = ||c|| 2 - 7 - ||x - c!| 2 = || a* - cf - ||x - c|| 2 
= max ||y — c|| 2 — ||x — c|| 2 . 

Clearly, /i(x) — £*(x) > for all x S £, and the maximum over x S £ is attained at 
x = c. Moreover, 

h(c)-£*(c) = max||y-c|| 2 = S(£) 2 /A. 

Consequently, (|3.10p becomes an equality for £ = £* , which implies the optimality of 
£* in (3HJ|. □ 

To evaluate the best afhne underestimate given by Theorem 13.11 we need to 
solve the optimization problem (|3 ,4[) . This amounts to finding the major axis of the 
ellipsoid. The solution is 

H = c + sy, 

where y is a unit eigenvector of A associated with the smallest eigenvalue e, and s is 
chosen so that fi lies on the boundary of the £. From the definition of £, we obtain 

s = J (c T Ac + p)/e. 

We minimize the function Jl in (| 1 . 2[) over £ Oil, with £ the best affine underes- 
timate of — ||x|| 2 , to obtain a lower bound for the objective function over £ Oft. An 
upper bound for the optimal objective function value is obtained by starting any local 
optimization algorithm from the best iterate generated during the computation of the 
lower bound. For the numerical experiments reported later, the gradient projection 
algorithm [IT] is the local optimization algorithm. Of course, by using a faster local 
algorithm, the overall speed of the global optimization algorithm will increase. 

4. Phase one. In each step of the branch and bound algorithm for (P), we need 
to solve a problem of the form 

min /(x) subject to x G £ fl fl, (4-1) 

in the special case where / is convex (the function Jl in (II. 2[l ) and £ is an ellipsoid. 
In order to solve this problem, we often need to find a feasible point. One approach 
for finding a feasible point is to consider the minimization problem 

min x T Ax — 2b T x subject to x G 0, (4.2) 

where A and b are associated with the ellipsoid £ in (|3.ip . Assuming we know a 
feasible point xq £ 11, we could apply an optimization algorithm to (|4.2p . If the 
objective function value can be reduced below p, then we obtain a point in £. If the 
optimal objective function value is strictly larger than p, then the problem (|4.1[) is 
infcasible. 

If the set fl is itself the intersection of ellipsoids, then the procedure we have just 
described could be used in a recursive fashion to determine a feasible point for cither 
fl or £ n fl, if it exists. In particular, suppose fl = DjL^j is the intersection of m 
ellipsoids, where 

£j = {x e W : x T A jX - 2bjx < pj}. 
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A point X! G E\ is readily determined. Proceeding by induction, suppose that we 
have a point ~x. k -\ G C\jZ.\£j. Any globally convergent iterative method is applied to 
the convex optimization problem 

min x T AfcX — 2b Jx subject to x G C\j~\£j. 

If the objective function value is reduced below pk, then a feasible point in Pq—iSj 
has been determined. Conversely, if the optimal objective function value is above p k , 
then r\j =1 £j is empty. 

5. Branch and bound algorithm. Our branch and bound algorithm is pat- 
terned after a general branch and bound algorithm, as appears in [13] for example. 
For any ellipse £ , define 

M L {£) = min {/ L (x) : x G £ n fi}, (5.1) 

where /l is the lower bound (jl.2p corresponding to the best affine underestimate of 
— ||x|| 2 on £. We assume that an algorithm is available to solve the optimization 
problem (|5.1D . 

Ellipsoidal branch and bound with linear underestimate (EBL) 

1. Let £q be an ellipsoid which contains SI and set Sq = {£q}. 

2. Evaluate Ml(£q) and let xo G SI denote the feasible point generated during 
the evaluation of Ml{£q) with the smallest function value. 

3. For k = 0,1,2,... 

(a) Choose £ k G S k such that M/,(£fe) = min{Mr,(£) : £ G S}. Bisect 
£k G Sk with two ellipsoids denoted £ k \ and £^2 (see Section[2|). Evaluate 
M L (£ kl ) and M L (£ k2 ). 

(b) Let Xfe + i denote a feasible point associated with the smallest function 
value that has been generated up to this iteration and up to this step. 
Hence, if y k i and y k2 are solutions to (|5 . 1 1) associated with £ = £ k i and 
£ = £ki respectively, then we have /(x^+i) < /(yfei), i = 1,2. 

(c) Set S k+1 = {£eS k U {£ kl } U {£ fc2 } : M L {£) < f(x k+1 ),E + £ k } 

Theorem 5.1. Suppose that the following conditions hold: 

Al. The feasible set Si is contained in some given ellipsoid £ , SI is compact, and 

f is weakly convex over £ . 
A2. A nested sequence of ellipsoidal bisections shrinks to a point (see Section^). 
Then every accumulation point of the sequence is a solution of (P). 

Proof. Let y denote any global minimizer for (P). We now show that for each 
k, there exists £ e S k with y G £ . Since SI C £q, y G £q. Proceeding by induction, 
suppose that for each j , < j < k, there exists an ellipsoid Tj G Sj with yefj. We 
now wish to show that there exist T k +\ G S k +\ with y G F k +\. In Step 3c, T k G S k 
can only be deleted from S k +i if Mi(jF fc ) > /(x^+i) or T k — £ k . The former case 
cannot occur since 

M L (T k ) < /(y) < /(xfc+i), 

due to the global optimality of y. If T k — £ k , then y lies in either £ kl or £ k2 . If 
y G £ k i, then £ fei G 5 fe+ i since 



M L (£ kl ) < f(y) < /(x fc+1 ). 
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Let x* denote an accumulation point of the sequence Since £1 is closed and 
x. k S ri for each k, x* G 17. By [2"51, Prop. 4.4], a weakly convex function is locally 
Lipschitz continuous. Hence, / is continuous on £1 and f(x. k ) approaches /(x*). If x* 
is a solution of (P), then the proof is complete. Otherwise, /(y) < /(x*). 

For each fe, we have 

min {M L {£) : £ e S k } < M L {T k ) < f(y) < /(x*). (5.2) 

Let Gk denote an ellipsoid which achieves the minimum on the left side of (|5.2[) and 
let y k denote a minimizer in (|5.1|) corresponding to £ = Gk- The inequality (|5.2|) 
reduces to 

jx(y fc ) < /(y) < /(x*). (5.3) 

Since y minimizes / over Q, (|5.3|) implies that 

My*) < /(y) < f(y k ). (5.4) 

By Theorem 13. 1[ 

/(yk) - h(y k ) = -^(||y fc || 2 + *(y*)) < <rS(g k ) 2 /A, (5.5) 

where ^ is the best linear lower bound for the function — ||x|| 2 , and a > is the 
parameter associated with the convex/concave decomposition 

Each ellipsoid £ k corresponds to a vertex on the branch and bound tree associated 
with EBL. Choose the iteration numbers k\ < &2 < ■ ■ ■ so that they correspond to 
vertices along an infinite path on the branch and bound tree, starting from the root 
of the tree. By (A2), S(G ki ) tends to as i tends to infinity. Hence, (|5.5[) implies 
that \f{y ki ) — fh{y ki )\ tends to zero. Combining this with (|5.3[) and (|5.4[) shows 
that f(y ki ) < /(x*) for i sufficiently large, which violates Step 3b and the fact that 
/(x/-_|_i) is the smallest function value at step k and the smallest values monotonically 
approach /(x*). □ 

Note that if for any k, /(x^) = uoxl{Ml{£) : £ € S k }, then x,t is a global 
minimizer. 

6. Ball approximation algorithm for convex optimization. In this section 
we give an algorithm to solve (P) in the special case that / and O are convex. This 
algorithm, which is based on the successive approximation of the feasible set by balls, 
ties in nicely with the ellipsoidal-based branch and bound algorithm. The algorithm 
is a generalization of the ball approximation algorithm [17j of Lin and Han. The 
algorithm of Lin and Han deals with the special case where the objective function has 
the form ||x— a|| 2 and 51 is an intersection of ellipsoids. Lin generalizes this algorithm 
in [16] to treat convex constraints. The analysis in [TBI ttZ] is tightly coupled to the 
norm objective function. In our further generalization of the Lin/Han algorithm, 
the norm objective function is replaced by an arbitrary convex functional / and an 
additional constraint set \ C K n is included, which might represent bound constraints 
for example. More precisely, we consider the problem 

min/(x) subject to x e T := {x e \ '■ g( x ) < 0}> (C) 

where / : M™ — ► M, g : R" — » R m , and the following conditions hold: 

CI. / and g are convex and differentiable, \ is closed and convex, and T is 
compact. 
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C2. There exists x in the relative interior of x with g(x) < 0. 
C3. There exists 7 > such that || V<7j(x)|| > 7 when ^(x) = for some i £ [1, in] 
and x G x- 

The condition C2 is referred to as the Slater condition. 

We will give a new analysis which handles this more general convex problem (C). 
In each iteration of Lin's algorithm in [16j , the convex constraints are approximated 
by ball constraints. Let h : R™ — > R be a convex, differentiable function which defines 
a convex, nonempty set 

H = {xeM": h{x) < 0}. 

The ball approximation Bh{~x) at x € H. is expressed in terms of a center map c : 
H — > R™ and a radius map r : H — > R: 

W = {yel" : ||y-c(x)|| <r(x)}. 

These two maps must satisfy the following conditions: 

Bl. Both c and r are continuous on TL. 

B2. If h(x) < 0, then x € hit B/j(x), the interior of 2?h,(x). 

B3. If /i(x) = 0, then x e dBh(x), and c(x) = x— aV/i(x) T for some fixed a > 0. 

Maps which satisfy Bl, B2, and B3 are the following, assuming h is continuously 
differentiable: 

c(x) = x - aVfe(x) T , r(x) = a||V/i(x)|| - /3/i(x), 

where a and (3 are fixed positive scalars. 

Let Cj and denote center and radius maps associated with <?i, let be the 
associated ball given by 

8,(x) = {yeK" = lly-ci(x)|| <n(x)}, 

and define B(x) = n^ :1 Bi(x). Our generalization of the algorithm of Lin and Han is 
the following: 

Ball approximation algorithm (BAA) 

1. Let Xo be a feasible point for (C). 

2. For k = 0, 1, . . . 

(a) Let yfc be a solution of the problem 

min /(x) subject to x G % n S(x fe ). (6.1) 

(b) Set Xfc+i = x(rfe) where x(t) = (1 — r)xfc + ryu and Tk is the largest 
r € [0, 1] such that x(c) € J 7 for all cr e [0, r]. 

In p76l Lem. 3.1] it is shown that int £>(x) 7^ for each xG f when the center and 
radius maps Cj and satisfy B2 and B3 and there exists x such that g(x) < 0. Lin's 
proof is based on the following observation: For e > sufficiently small, x + e(x — x) 
lies in the interior of 0j(x) for each i. In C2 we also assume that x 6 ri x, where "ri" 
denotes relative interior. Hence, for e > sufficiently small, x + e(x — x) lies in both 
ri x an d in the interior of Sj(x) for each i. Consequently, we have 



ri x H int B(x) 7^ for every xef. 



(6.2) 
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This implies that the subproblems (|6.1|) of BAA are always feasible. An optimal 
solution exists due to the compactness of the feasible set and the continuity of the 
objective function. 

Theorem 6.1. If CI, C2, and C3 hold and the center map c; and the radius 
map ri satisfy Bl, B2, and B3, i = 1,2, ... ,m, then the limit x* of any convergent 
subsequence of iterates Xfc of Algorithm 2 is a solution of (C). 

Proof. Initially, xo E T . Proceeding by induction, it follows from the line search 
in Step 2a of BAA that x t ef for each k. By B2 and B3, x G B h (x) if h(x) < 0. 
Consequently, G x H i?(x/c) for each k. This shows that X& is feasible in (|6.ip for 
each fc, and the minimizer yfc in (|6.ip satisfies 

/(yfc) < /(Xfc) for each k. (6.3) 

By the convexity of / and by (|6 . 3[) . we have 

/(xfc+i) < 7i/(y fc ) + (1 - 7Tb)/(xfc) < /(x fe ), (6.4) 

where Tfc G [0,1] is defined in Step 2b of BAA. Hence, /(x/.) approaches a limit 
monotonically. Since T is compact and Xfc G for each fc, an accumulation point 
x* G exists. Since the center maps Ci and the radius maps are continuous, 
the balls i^(xfc) are uniformly bounded, and hence, the yfc are contained in bounded 
set. Let y* denote an accumulation point of the yfc. To simplify the exposition, let 
(xfc,yfc) denote a pruned version of the original sequence which approaches the limit 
(x*,y*). 

We now show that 

y* = arg min /(x) subject to x e \ n B(x*). (6.5) 

Suppose, to the contrary, that there exists y € x n B(x*) such that /(y) < /(y*). 
Referring to the discussion before (|6.2p . choose x G ri x with x G int B,(x*) for each i. 
Define y = y + e(x — y) where e > is small enough that y € ri x, y € int Bi(x*) for 
each i, and f(y) < f(y*)- For sufficiently large, y € %nS(xfc) due to the continuity 
of the center and radius maps. Since /(yfc) approaches /(y*) > /(y), we contradict 
the optimality of yfc in (|6.ip . This establishes (|6.5p . 

Again, by B2 and B3, x* is feasible in (|6.5p . Since y* is optimal in (16. 5p . we have 
/(y*) < /(x*). We will show that 

/(y*) = /(x*). (6.6) 

Suppose, to the contrary, that f(y*) < /(x*). Since x* G JT, we conclude that for 

each i, one of the following two cases can occur: 

(i) 5i(x*) = 0: In this case, it follows from B3 that x* S 9Sj(x*). Since both 
x* and y* G x n ^( x *), w e have [x*,y*] G x H Bi(x*). Hence, the vector 
y* — x* makes an acute angle with the inward pointing normal at x* . By B3 
the inward pointing normal is a positive multiple of — Vg(x*); it follows that 

-V<?(x*)(y*-x*) >0. 

By a Taylor expansion around x*, we see that there exist cr, G (0, 1) such 
that 



gi(x* + cr(y* - x*)) < for all a G (0,^]. 



(6.7) 
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(ii) <jfj(x*) < 0: In this case, there trivially exists <7j € (0, 1) such that (|6.7p holds. 
Let a* be the minimum of er^, 1 < i < m. By the convexity of /, we have 

/(x* + a*(y* - x*)) < /(x*) + a*(/(y*) - /(x*)) < /(x*) (6.8) 

since /(y*) < /(x*)). Since both x fc and y^ S x H B(x k ), the line segment [x fc ,y fc ] 
is contained in x H fi(xfc). Since x fc approaches x* and y^ approaches y*, it follows 
from (|6.7p that 

Xfc + cr*(y fc - Xfc) € T 

for A; sufficiently large. Again, by the convexity of /, (|6.3|) . and the fact that Tfc is 
taken as large as possible so that 

x fe+ i = (1 - T fe )x fc + r fe y fc G J", 

we have 

/(xfc+i) < /(x fe ) + r fe (/(y fc ) - /(xfc)) < /(xfc) + <7*(/(y fe ) - /(x fe )). (6.9) 
Since (xfc,yfc) converges to (x*,y*), it follows from (|6.8p that 

lim /(x fc ) + tr*(/(y fc ) - /(x fe )) - /(x*) + a*(/(y*) - /(x*)) < /(x*). (6.10) 

fc — >oc 

Hence, for k sufficiently large, (|6.9p and (|6.10p imply that /(xfe+i) < /(x*), which 
contradicts the monotone decreasing convergence (|6.4p of /(x^) to /(x*). This com- 
pletes the proof of (|6.6p . 

Let L : R m +" R be the Lagrangian defined by 

L(X, x) = /(x) + - J2 A* (||x - ^(x*)!! 2 - r,(x*)) . 

i=l 

Since y* is a solution of (|6.5p and the Slater condition (|6.2p holds, the first-order 
optimality condition holds at y* . That is, there exist A* G M m such that 



A*>0, A*(||y*-c ! (x*)|| 2 -r 2 (x*)) = 0, i=l,2,. 
V XJ L(A*,y*)(x - y*) > for all x e X- 



(6.11) 



If V/(y*)(x — y*) > for all x£ x, then y* is the global minimizer of the convex 
function / over \- Since f(y*) = /(x*) by (|6.6p . it follows that x* is a solution of 
(C), and the proof would be complete. Hence, we suppose that V/(y*)(x — y*) < 
for some xg^, which implies that A* =^ by (|6.1ip . 
Since / is convex, we have 

/(x*) > /(y*) + V/(y*)(x* - y*). (6.12) 

We expand the expression 



-^A.dlx-c^lp-rKx*)) 



2 

i=l 
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in a Taylor series around x = y* and evaluate at x = x* to obtain 

m m 

- £ A, (||x* - Cl (x*)|| 2 - r 2 (x*)) = - J2 A* (lly* - c,(x*)|| 2 - r 2 (x*)) 

i=l i=l 

m 1 m 

+ £ A 4 (y* - c,(x*)) T (x* - y*) + -||x* - y*|| 2 £ A*. 
t=l z=l 

We add this equation to (|6.12p to obtain 

m 

L(x*, A*) > L(y*, A*) + V x L(y*, A*)(x* - y*) + -||x* - y*|| 2 ]T A*. (6.13) 

i=l 

By complementary slackness and by (|6.6p . we have L(y*, A*) = /(y*) = /(x*). Hence, 
(|5TT5|) yields 

-||x* - y*|| 2 J] A* < -V a L(y*, A*)(x* - y*) 

2=1 

1 m 

+ -^A:(||x*-c J (x*)|| 2 -r 2 (x*)). (6.14) 
i=i 

By (|6.11[) and the fact that x* € X, we have V x L{y*, A*)(x* - y*) > 0. Since 
x* e B(x*), the last term in (|6.14|) is nonpositive. Hence, the entire right side of 
([6~T4]) is nonpositive. Since A* > and A* ^ 0, (|6~T4]l implies that y* = x*. 
Replacing y* by x* in the first-order conditions (|6.11[) gives 

( V/(x*) + A **( x * _ C *( X *)) T ^ ( x - x *) > for a11 x e X- (6.15) 

If <7i(x*) < 0, then by B2, x* £ hit Bj(x*) and A* = by complementary slackness. 
If ft(x*) = 0, then by B3, c t (x*) = x* - aVg^x*) 1 ". With these substitutions, ||6~T5|) 
yields 

^V/(x*) + a ][J A|Vffi(x*)^ (x - x*) > for all x e X- 

Hence, the first-order optimality conditions for (C) are satisfied at x*. Since the 
objective function and the constraints of (C) are convex, x* is a solution of (C). This 
completes the proof. □ 

7. Numerical experiments. We investigate the performance of the algorithms 
of the previous sections using randomly generated quadratically constrained quadratic 
programming problems of the form 

minx T Aox + bjx subject to g(x) < 0, (QP) 

where x £ R™ and <?i(x) = x T A;x + bjx + a, i — 1,2, ... ,m. Here b^ £ R" and 
Cj is a scalar for each i. The matrices are symmetric, positive definite for i > 1. 
In our experiments with the ball approximation algorithm, we take Ao symmetric, 
positive semidcfinitc. In our experiments with the branch and bound algorithm, we 
consider more general indefinite Ao. The codes are written in either C or Fortran. 
The experiments were implemented using a Matlab 7.0.1 interface on a PC with 2GB 
memory and Intel Core 2 Duo 2Ghz processors running the Windows Vista operating 
system. 
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7.1. Rate of convergence for BAA. The theory of Section [5] establishes the 
convergence of BAA. Experimentally, we observe that the convergence rate is linear. 
Figure 17.11 shows that the behavior of the KKT error as a function of the iteration 
number for a randomly generated positive definite matrix Ao of dimension 200 and for 
4 ellipsoidal constraints (m = 4) . The KKT error is computed using the formula given 
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Fig. 7.1. KKT error versus iteration number for n = 200, m = 4, and Aq positive definite 



in Section 4 of 10J. Roughly, this formula amounts to the infinity norm of the gradient 
of the Lagrangian plus the infinity norm of the violation in complementary slackness. 
If A is constructed to have precisely one zero eigenvalue, then the convergence rate 
again appears to be linear, as seen in Figure [7T21 

7.2. Comparison with other algorithms for programs with convex cost. 

To gain some insight into the relative performance of the ball approximation algorithm 
(BAA), we solved randomly generated problems with convex cost using three other 
algorithms: 

• SEDUMI, for optimization over symmetric cones. 

• The gradient projection algorithm. We tried both the nonmonotone gradient 
project algorithm (NGPA) given in [11) and the nonmonotone spectral pro- 
jected gradient method (SPG) of Birgin, Martinez, and Raydan [51 [3] (ACM 
Algorithm 813). 

We now discuss in detail how each of these algorithms was implemented. The 
BAA subproblems (j6.1[) have the form 



min x t Aqx 



bjx 



subject to || x — Ci 



1 < i < m. 



(7.1) 



We solve these subproblems by applying the active set algorithm (ASA) developed 
in [TT] to the dual problem. To facilitate the evaluation of the dual function, we 
compute the diagonalization An = QDQ T where D is diagonal and Q is orthogonal. 
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Fig. 7.2. KKT error versus iteration number for n = 200, m = 4, and a positive semidefinite Ao. 

Substituting x = Qy in (|7.1[) yields the equivalent problem 

miny T Dy + b^Qy subject to ||y - Q T cJ 2 < rf , 1 < i < m. 
The dual problem is 



max 



x min y^y + bjQy + ^TA.Oly-Q^II 2 -^ 2 ). (7.2) 



The i-th component of the gradient of the dual function with respect to A is simply 
||y(A) — Q T Cj|| 2 — rf where y(A) achieves the minimum in (|7.2p . This minimum is 
easily evaluated since the quadratic term in the objective function is diagonal. 

SEDUMI could be applied directly to (QP) when the cost function is strongly 
convex. We used Version 1.1 of the code obtained from 



http://sedumi.mcmaster.ca/ 



In implementing the gradient projection algorithm for (QP), we need to project a 
vector onto the feasible set. This amounts to solving a problem of the form 

min ||x — a|| 2 subject to g(x) < 0. 

We solved this problem using BAA. An iteration of BAA reduces to the solution of a 
problem with the following structure: 

min ||x - a|| 2 subject to ||x - cj 2 < r 2 , i=l,2, ...,m. (7.3) 

As in [16], we solve these problems by forming the dual problem 

m 

max min ||x - a|| 2 + V A 4 (||x - || 2 - r?) . 

A>0 x£R" t-^l V ' 
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Table 7.1 
Positive definite cases 



n, m 




SED 


success 


BAA 


success 


NGPA 


success 


SPG 


success 


100,4 


time 
iter 


0.52 
10.06 


28 


0.07 
19.00 


30 


4.70 
172.43 


17 


5.94 
200.06 


17 


200,4 


time 
iter 


2.75 
9.56 


26 


0.32 
12.70 


30 


10.68 
203.23 


21 


11.76 
348.36 


20 


300,4 


time 
iter 


8.14 
9.60 


27 


0.72 
21.46 


30 


128.19 
269.86 


20 


122.23 
431.83 


20 


400,4 


time 
iter 


20.13 
10.26 


27 


1.64 
49.26 


29 


404.28 
352.66 


18 


438.84 
545.23 


18 


500,4 


time 
iter 


44.07 
12.33 


28 


2.54 
29.90 


30 


579.21 
369.20 


15 


647.56 
574.13 


13 


600,4 


time 
iter 


57.28 
9.30 


26 


4.27 
36.80 


29 


648.79 
309.43 


19 


611.60 
306.50 


19 


100,40 


time 
iter 


3.51 
10.26 


28 


0.08 
19.00 


30 


86.89 
150.76 


21 


81.67 
165.66 


21 


200,40 


time 
iter 


26.56 
12.70 


30 


0.32 
12.70 


30 


268.63 
199.70 


17 


250.22 
218.50 


16 


200,100 


time 
iter 


54.56 
10.66 


30 


0.81 
9.50 


30 


732.50 
295.80 


20 


727.92 
327.26 


20 


100,200 


time 
iter 


23.84 
14.96 


30 


0.72 
20.06 


30 


579.62 
249.43 


18 


530.02 
261.46 


19 


4,100 


time 
iter 


0.093 
9.06 


29 


0.002 
6.96 


30 


3.02 
19.46 


26 


2.75 
19.40 


25 


4,200 


time 
iter 


0.114 
9.56 


27 


0.004 
8.73 


30 


6.23 
16.26 


26 


5.70 
16.46 


26 


4,300 


time 
iter 


0.148 
11.06 


25 


0.012 
12.56 


30 


13.45 
15.26 


24 


11.58 
15.33 


23 


4,400 


time 
iter 


0.195 
13.33 


28 


0.014 
12.26 


30 


16.27 
16.26 


28 


12.87 
15.70 


28 


4,500 


time 
iter 


0.221 
13.83 


26 


0.017 
11.50 


30 


21.08 
13.83 


26 


18.16 
13.83 


26 


4,600 


time 
iter 


0.235 
12.13 


26 


0.018 
11.00 


30 


31.65 
15.40 


24 


34.83 
16.33 


24 



After carrying out the inner minimization, this reduces to 



max- l|a +^ A f l! + £ MINI' - r i)- ( 7 - 4 ) 



If A solves the dual problem (17. 4p , then the associated solution of the primal problem 
CZD is 

Again, the dual problem (|7.4p is solved using the active set algorithm (ASA) of [TT] . 

The test problems used in Tables 17711 and F7721 were generated as follows: Let 
Rand(n, I, u) denote a vector in W 1 whose entries are chosen randomly in the interval 
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(l,u). Random positive definite matrices A are generated using the procedure given 
in [17j . which we now summarize. Let w, ; S Rand(n, —1, 1) for i = 1, 2, 3, and define 

Q 4 = I - 2v i v 1 ~, Vj = Wt/||wi||. 

Let D be a diagonal matrix with diagonal in Rand(n,0, 100). Finally, A = UDU T 
with U = Q1Q2Q3. To obtain a randomly generated positive semidcfinitc matrix, 
we use the same procedure, however, we randomly set one diagonal element of D to 
zero. 

We make a special choice for Cj to ensure that the feasible set for (QP) is nonempty. 
We first generate p € Rand(n, —50, 50) and we set 

Cj = -(p T A 4 p + bjp + s^, 

where s$ is randomly generated in the interval [0,10] and bj £ Rand(n, — 100, 100). 
With this choice for a, the feasible set for (QP) is nonempty since p lies in the interior 
of the feasible set. The stopping criterion in our experiments was 

||P(x fc -g fc )-x fc || < HT 4 , (7.5) 

where P denotes projection into the feasible set for (QP) and = 2AoXfc + bo is 
the gradient of the objective function at x^. When the cost is convex, the left side of 
(|7.5[) vanishes if and only if is a solution of (QP). 

Tables I7TT1 and 1731 report the average CPU time in seconds {time), the average 
number of iterations (iter), and the number of successes in 30 randomly generated 
test problems. The algorithm was considered successful if the error tolerance (|7.5|) 
was satisfied. 

Based on our numerical experiments, it appears that BAA can achieve an error 
tolerance on the order of the square root of the machine epsilon [§1 [M] , similar to the 
computing precision which is achieved by interior point methods for linear program- 
ming prior to simplex crossover. The convergence tolerance (|T. 5[) was chosen since 
it seems to approach the maximum accuracy which could be achieved by BAA in 
these test problems. Numerically, BAA seems to terminate when the solution to the 
subproblcm (|6.1[) yields a direction which departs from the feasible set, and hence, 
the stepsize in the line search Step 2b is zero. We were able to achieve a further 
improvement in the solution by taking a partial step in this infeasible direction since 
the increase in constraint violation was much less than the improvement in objective 
function value. Nonetheless, the improvement in accuracy achieved by permitting 
infeasibility was at most one digit in our experiments. 

In Tables 17.11 and 17.21 we see that BAA gave the best results for this test set, 
both in terms of CPU time and in terms of successes (the number of times that 
the convergence tolerance (|7.5[) was achieved). Recall that the gradient projection 
algorithms in our experiments used BAA to compute the projected gradient. The 
convergence failures for the gradient projection algorithms in Tables [77X1 and [721 were 
due to the fact that BAA was unable to compute the projected gradient with enough 
accuracy to yield descent in the gradient projection algorithm. 

7.3. Problems with nonconvex cost. We tested our ellipsoidal branch and 
bound algorithm using some randomly generated test problems with Ao indefinite. 
To compute fi in (|3.4j) . we used the power method (see [24]) to find the eigenvector 
associated with the largest eigenvalue. We chose A in (|1.2p to be 0.1 minus the smallest 
eigenvalue of Aq. NGPA was used to locally solve (QP) and update the upper bound. 
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Table 7.2 
Positive semidefinite cases 



n, m 




BAA 


success 


NGPA 


success 


SPG 


success 


100,4 


time 
iter 


0.11 
42.16 


30 


15.04 
328.53 


22 


17.38 
408.20 


19 


200,4 


time 
iter 


0.55 
99.33 


30 


44.32 
313.10 


20 


45.82 
356.43 


22 


300,4 


time 
iter 


1.02 
74.23 


30 


290.19 
374.13 


22 


304.89 
417.60 


21 


400,4 


time 
iter 


2.28 
111.83 


30 


501.14 
404.66 


19 


572.63 
492.83 


19 


500,4 


time 
iter 


5.37 
200.03 


27 


620.13 
382.66 


16 


657.61 
478.60 


17 


100,40 


time 
iter 


0.61 
82.30 


30 


356.40 
276.56 


22 


321.92 
237.23 


21 


200,40 


time 
iter 


2.74 
127.23 


30 


398.54 
369.43 


17 


415.28 
416.060 


17 


100,200 


time 
iter 


3.19 
108.63 


28 


1030.02 
311.40 


19 


949.29 
352.23 


18 


4,100 


time 
iter 


0.054 
38.66 


30 


16.74 
31.13 


16 


14.25 
31.23 


16 


4,200 


time 
iter 


0.075 
43.46 


27 


44.74 
26.20 


13 


33.24 
23.36 


18 


4,300 


time 
iter 


0.076 
31.50 


29 


111.03 
29.33 


14 


100.38 
28.60 


11 


4,400 


time 
iter 


0.049 
36.73 


29 


205.17 
27.20 


18 


237.62 
31.23 


18 


4,500 


time 
iter 


0.065 
41.86 


28 


229.77 
24.46 


16 


247.82 
26.30 


17 



We took m = 2 and randomly generated test problem using the procedure in pQ . 
That is, the ellipsoidal constraint functions in (QP) have the form 

ffi (x) = (x-CifBr^x-c;)-!, 

where B 4 = UD,U T and U is as given earlier. is a diagonal matrix with its 
diagonal in Rand(n, 0, 60), Ci e Rand(n, 0, 100), and C2 = Ci + .8v where v is the semi- 
major axis of the ellipsoid <?i(x) < 0. For this choice of C2, the ellipsoids g\ (x) < 
and 52 (x) < have nonempty intersection at x = C2- In the objective function, 
Ao = UDU T where D is a diagonal matrix with diagonal in Rand(n, —30, 30) and 
bo G Rand(n, — 1, 1). The case m = 2 is especially important since quadratic problems 
with two ellipsoidal constraints belong to the class of Celis-Dennis-Tapia subproblcms 
[1] which arise from the application of the trust region method for equality constrained 
nonlinear programming [TH HH1 M H3 E3 HS1 US] • 

If UBfc and LB^ are the respective upper and lower bounds for the optimal ob- 
jective function value at iteration fe, then our stopping criterion was 

UB fc - LB fc < max{e a , e r |LB fc |}, 

with e a = 10~ 5 and e r = 10~ 2 . 

We considered problems of 8 different dimensions ranging from 30 up to 300 as 
shown in Table 17.31 For each dimension, we solved 4 randomly generated problems. 
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Table 17.31 shows the numerical results for our test instances, where "neigs" is the 
number of negative eigenvalues of the objective function, and "ub 1 " are the 

lower bound and upper bounds at the first step, li val" is the computed optimal value 
and tl it n is the number of iterations. We also report the performance of the algorithm 
for m = 6 in Table [7T4l 

Table 7.3 

The performance of branch and bound algorithm for m = 2 



n 


neigs 


lb 1 


ub' 


val 


it 


time 


30 


12 


34827.3 


35256.3 


35254.8 


5 


1.75 




17 


-41212.1 


-40746.2 


-40748.8 


21 


3.58 




14 


38601.4 


38977.6 


38977.6 





0.72 




17 


-31534.2 


-31108.4 


-31119.8 


92 


8.82 


50 


22 


-357168.8 


-356828.8 


-356828.8 





0.52 




21 


-33792.9 


-33447.9 


-33447.9 


1 


0.84 




21 


-29694.6 


-29254.1 


-29255.2 


247 


23.08 




26 


35034.0 


35416.6 


35414.8 


5 


2.12 


60 


29 


17783.6 


18227.4 


18227.4 


78 


22.41 




26 


-27498.2 


-27110.5 


-27110.5 


69 


18.22 




30 


-69845.7 


-69463.1 


-69463.1 





0.56 




28 


20408.7 


20963.1 


20927.1 


273 


42.65 


100 


50 


-11495.2 


-11196.9 


-11218.9 


56 


30.72 




51 


17539.6 


17909.3 


17909.3 


4 


1.84 




52 


-46065.5 


-45653.2 


-45653.2 





0.88 




40 


970829.8 


971326.6 


971326.6 





0.92 


150 


75 


-302382.2 


-302071.0 


-302071.0 





0.95 




83 


29089.4 


29500.8 


29500.8 


64 


31.45 




72 


16580.5 


16904.9 


16904.9 


1 


1.98 




73 


-32461.9 


-32036.1 


-32036.1 


1 


1.37 


200 


100 


10798.5 


11226.1 


11226.1 


81 


56.58 




95 


-27242.9 


-26792.1 


-26792.1 


2 


2.27 




100 


35293.0 


35862.1 


35862.1 


1 


1.63 




96 


-31712.8 


-31138.3 


-31138.3 


77 


47.06 


250 


135 


37015.8 


37477.6 


37477.6 


1 


2.9 




131 


-27278.9 


-26563.0 


-26780.0 


86 


88.40 




128 


-9979.6 


-9683.9 


-9683.9 


59 


131.54 




121 


-371385.9 


-370991.9 


-370991.9 





2.03 


300 


145 


-162041.5 


-161645.7 


-161645.7 





5.33 




152 


-48085.4 


-47529.3 


-47529.3 


1 


7.56 




138 


226345.6 


226377.8 


226377.8 





4.79 




148 


-17649.5 


-17013.7 


-17323.2 


109 


257.52 



In comparing our ellipsoidal branch and bound algorithm based on linear under- 
estimation (EBL) to the ellipsoidal branch and bound algorithm of Le Thi Hoai An 
PQ based on dual underestimation (EBD), an advantage of EBD is that the under- 
estimates are often quite tight in the dual-based approach. As seen in Table 17. 31 
EBL required up to 273 bisections for this test set while EBD in [1] was able to 
solve randomly generated test problems without any bisections. On the other hand, 
a disadvantage of EBD is that the dual problems are nondifferentiable when Ao is 
indefinite. Consequently, the evaluation of the lower bound using EBD entails solving 
an optimization problem which, in general, is nondifferentiable. With EBL, how- 
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ever, computing a lower bound involves solving a convex optimization problem. To 
summarize, EBD provides tight lower bounds using a nondifferentiable optimization 
problem for the lower bound, while EBL provides less tight lower bounds using a 
convex optimization problem for the lower bound. 



Table 7.4 

The performance of branch and bound algorithm for m = 6 



n 


neigs 


lb 1 


life 1 


val 


it 


time 


30 


17 


22717.1 


22993.1 


22993.1 


1 


2.41 




21 


-22847.0 


-22520.4 


-22524.5 


14 


10.58 




20 


-17858.2 


-17573.2 


-17573.2 


1 


1.84 


60 


33 


-21818.1 


-21489.1 


-21489.2 


33 


21.64 




27 


47683.8 


47826.0 


47826.0 





2.82 




31 


-4926.5 


-4652.0 


-4728.7 


4 


7.62 


100 


56 


-35438.9 


-35411.0 


-35411.0 





0.78 




52 


-1740.1 


-1187.5 


-1198.2 


354 


283.25 




49 


-6756.5 


-6148.9 


-6148.9 


3 


8.06 



8. Conclusions. A globally convergent branch and bound algorithm was devel- 
oped in which the objective function was written as the difference of convex functions. 
The algorithm was based on an affine underestimate given in Theorem [3J] for the con- 
cave part of the objective function restricted to an ellipsoid. An algorithm of Lin and 
Han [TBI HZ] f° r projecting a point onto a convex set was generalized so as to replace 
their norm objective by an arbitrary convex function. This generalization could be 
employed in the branch and bound algorithm for a general objective function when the 
constraints are convex. Numerical experiments were given for a randomly generated 
quadratic objective function and randomly generated convex, quadratic constraints. 
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